Constructing Hypothetical Risk Data from the Area under the ROC Curve: Modelling Distributions of Polygenic Risk
Abstract
BACKGROUND Modeling studies using hypothetical polygenic risk data can be an efficient tool for investigating the effectiveness of downstream applications, such as targeting interventions to risk groups, and thus for judging whether empirical investigation is warranted. We investigated the assumptions underlying a method that simulates risk data for specific values of the area under the receiver operating characteristic curve (AUC). METHODS The simulation method constructs risk data for a hypothetical population based on the population disease risk and on the odds ratios and frequencies of genetic variants. By systematically varying the parameters, we investigated under what conditions AUC values represent unique ROC curves with unique risk distributions for patients and nonpatients, and to what extent risk data can be simulated for precise values of the AUC. RESULTS Using a larger number of genetic variants, each with a modest effect, we observed that the distributions of estimated risks of patients and nonpatients were similar for various combinations of the odds ratios and frequencies of the risk alleles. Simulated ROC curves overlapped empirical curves with the same AUC. CONCLUSIONS Polygenic risk data can be effectively and efficiently created using a simulation method. This allows further investigation of the potential applications of stratifying interventions on the basis of polygenic risk.
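The simulation described under METHODS can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes independent variants in Hardy-Weinberg equilibrium, a multiplicative (per-allele) odds-ratio model on the logistic scale, and an intercept calibrated by bisection so that the mean risk equals the population disease risk. The function names `simulate_risks` and `auc` and all parameter values are hypothetical.

```python
import math
import random


def simulate_risks(n_people, pop_risk, odds_ratios, allele_freqs, seed=0):
    """Simulate individual disease risks and disease status (illustrative only)."""
    rng = random.Random(seed)
    # Genetic score: risk-allele count (0, 1 or 2) times log odds ratio, summed over variants.
    scores = []
    for _ in range(n_people):
        score = 0.0
        for odds_ratio, freq in zip(odds_ratios, allele_freqs):
            n_alleles = (rng.random() < freq) + (rng.random() < freq)
            score += n_alleles * math.log(odds_ratio)
        scores.append(score)

    def mean_risk(intercept):
        return sum(1.0 / (1.0 + math.exp(-(intercept + s))) for s in scores) / n_people

    # Assumption: calibrate the intercept by bisection so mean risk matches pop_risk.
    lo, hi = -30.0, 30.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if mean_risk(mid) < pop_risk:
            lo = mid
        else:
            hi = mid
    intercept = (lo + hi) / 2.0

    risks = [1.0 / (1.0 + math.exp(-(intercept + s))) for s in scores]
    # Disease status is drawn from each person's own risk.
    status = [rng.random() < r for r in risks]
    return risks, status


def auc(risks, status):
    """AUC via the Mann-Whitney rank statistic, using average ranks for ties."""
    order = sorted(range(len(risks)), key=lambda i: risks[i])
    ranks = [0.0] * len(risks)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and risks[order[j + 1]] == risks[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n_pos = sum(status)
    n_neg = len(status) - n_pos
    rank_sum = sum(r for r, diseased in zip(ranks, status) if diseased)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


if __name__ == "__main__":
    # Hypothetical inputs: 20 variants, each with odds ratio 1.2 and allele frequency 0.3.
    risks, status = simulate_risks(2000, 0.10, [1.2] * 20, [0.3] * 20, seed=1)
    print("mean risk:", sum(risks) / len(risks))
    print("AUC:", auc(risks, status))
```

In a sketch like this, the AUC is not set directly; it follows from the chosen odds ratios and allele frequencies. Reaching a specific AUC would mean varying those inputs and re-running the simulation until the resulting value is close enough to the target.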
Similar resources
Diagnostic Value of Risk Nomogram for the Prediction of Postpartum Hemorrhage Following Vaginal Delivery
Background: Postpartum hemorrhage (PPH) is considered one of the major causes of maternal mortality worldwide. The most effective risk factors have been suggested in various studies on risk nomograms for the prediction of PPH. Aim: This study aimed to determine the diagnostic value of the risk nomogram for the prediction of PPH. Method: Thi...
Comparing Three Data Mining Algorithms for Identifying the Associated Risk Factors of Type 2 Diabetes
Background: The increasing prevalence of type 2 diabetes has given rise to a global health burden and a concern among health service providers and health administrators. The current study aimed at developing and comparing some statistical models to identify the risk factors associated with type 2 diabetes. In this light, artificial neural network (ANN), support vector machines (SVMs), and multi...
Acceptance sampling for attributes via hypothesis testing and the hypergeometric distribution
This paper questions some aspects of attribute acceptance sampling in light of the original concepts of hypothesis testing from Neyman and Pearson (NP). Attribute acceptance sampling in industry, as developed by Dodge and Romig (DR), generally follows the international standards of ISO 2859, and similarly the Brazilian standards NBR 5425 to NBR 5427 and the United States Standards ANSI/ASQC Z1....
Comparison of ordinary logistic regression and robust logistic regression models in modeling of pre-diabetes risk factors
Background: Given the increased risk of developing type 2 diabetes in pre-diabetic people, identifying pre-diabetes and determining its risk factors is necessary. This study aimed to compare ordinary logistic regression and robust logistic regression models in modeling pre-diabetes risk factors. Methods: This cross-sectional study was conducted on 6460 people, over ...
Urban Inundation Hazard Potential using Evidential Belief Function model (EBF) (Case study: Emam Ali town, Mashhad city)
Inundation in urban areas due to intense storms has created many problems for cities throughout the world. Urban flood hazard zoning may provide useful information for dealing with contingencies and alleviating risk and loss of life and property. Therefore, to manage urban areas, prioritize flood relief measures, and address flooding problems, the areas that are more aff...